Sitong Gong

Perception or Prejudice: Can MLLMs Go Beyond First Impressions of Personality?

May 21, 2026
Via arXiv

Towards Interactive Intelligence for Digital Humans

Dec 15, 2025
Via arXiv

Living the Novel: A System for Generating Self-Training Timeline-Aware Conversational Agents from Novels

Dec 08, 2025
Via arXiv

Parameter Aware Mamba Model for Multi-task Dense Prediction

Nov 18, 2025
Via arXiv

The Devil is in Temporal Token: High Quality Video Reasoning Segmentation

Jan 15, 2025
Via arXiv

AVS-Mamba: Exploring Temporal and Multi-modal Mamba for Audio-Visual Segmentation

Jan 14, 2025
Via arXiv